[pull] master from ggerganov:master #143

pull · 2024-10-03T18:31:22Z

See Commits and Changes for more details.

* Add scaffolding for ggml logging macros
* Metal backend now uses GGML logging
* Cuda backend now uses GGML logging
* Cann backend now uses GGML logging
* Add enum tag to parameters
* Use C memory allocation funcs
* Fix compile error
* Use GGML_LOG instead of GGML_PRINT
* Rename llama_state to llama_logger_state
* Prevent null format string
* Fix whitespace
* Remove log callbacks from ggml backends
* Remove cuda log statement

* Single allocation of encode_async block with non-ARC capture in ggml-metal.m
* Moving Block_release to the deallocation code
* Release encode block when re-setting encoding buffer count if needed
* Update ggml/src/ggml-metal.m

---------

Co-authored-by: Georgi Gerganov <[email protected]>

* ggml : add metal backend registry / device ggml-ci
* metal : fix names [no ci]
* metal : global registry and device instances ggml-ci
* cont : alternative initialization of global objects ggml-ci
* llama : adapt to backend changes ggml-ci
* fixes
* metal : fix indent
* metal : fix build when MTLGPUFamilyApple3 is not available ggml-ci
* fix merge
* metal : avoid unnecessary singleton accesses ggml-ci
* metal : minor fix [no ci]
* metal : g_state -> g_ggml_ctx_dev_main [no ci]
* metal : avoid reference of device context in the backend context ggml-ci
* metal : minor [no ci]
* metal : fix maxTransferRate check
* metal : remove transfer rate stuff

---------

Co-authored-by: slaren <[email protected]>

Flake lock file updates:

• Updated input 'flake-parts':
    'github:hercules-ci/flake-parts/bcef6817a8b2aa20a5a6dbb19b43e63c5bf8619a?narHash=sha256-HO4zgY0ekfwO5bX0QH/3kJ/h4KvUDFZg8YpkNwIbg1U%3D' (2024-09-12)
  → 'github:hercules-ci/flake-parts/3d04084d54bedc3d6b8b736c70ef449225c361b1?narHash=sha256-K5ZLCyfO/Zj9mPFldf3iwS6oZStJcU4tSpiXTMYaaL0%3D' (2024-10-01)
• Updated input 'flake-parts/nixpkgs-lib':
    'https://github.com/NixOS/nixpkgs/archive/356624c12086a18f2ea2825fed34523d60ccc4e3.tar.gz?narHash=sha256-Ss8QWLXdr2JCBPcYChJhz4xJm%2Bh/xjl4G0c0XlP6a74%3D' (2024-09-01)
  → 'https://github.com/NixOS/nixpkgs/archive/fb192fec7cc7a4c26d51779e9bab07ce6fa5597a.tar.gz?narHash=sha256-0xHYkMkeLVQAMa7gvkddbPqpxph%2BhDzdu1XdGPJR%2BOs%3D' (2024-10-01)
• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/1925c603f17fc89f4c8f6bf6f631a802ad85d784?narHash=sha256-J%2BPeFKSDV%2BpHL7ukkfpVzCOO7mBSrrpJ3svwBFABbhI%3D' (2024-09-26)
  → 'github:NixOS/nixpkgs/bc947f541ae55e999ffdb4013441347d83b00feb?narHash=sha256-NOiTvBbRLIOe5F6RbHaAh6%2B%2BBNjsb149fGZd1T4%2BKBg%3D' (2024-10-04)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* ggml : do not use BLAS with types without to_float
* ggml : return pointer from ggml_internal_get_type_traits to avoid unnecessary copies
* ggml : rename ggml_internal_get_type_traits -> ggml_get_type_traits
  it's not really internal if everybody uses it
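The first two items above can be illustrated with a minimal sketch. Everything in it is hypothetical (the `demo_*` names, the two example types, the struct fields); it is not the ggml API, only an assumption about the general pattern the commit messages describe: return a pointer into a static traits table instead of a struct copy, and skip a float-based path when a type has no `to_float` conversion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified stand-ins; the real ggml structs are not shown in this PR page.
enum demo_type { DEMO_TYPE_F32 = 0, DEMO_TYPE_OPAQUE = 1, DEMO_TYPE_COUNT = 2 };

typedef void (*demo_to_float_t)(const void * src, float * dst, int64_t n);

struct demo_type_traits {
    const char *    type_name;
    size_t          type_size;
    demo_to_float_t to_float; // nullptr when the type has no conversion to float
};

// Trivial conversion for the float type: copy n values.
static void demo_f32_to_float(const void * src, float * dst, int64_t n) {
    const float * s = static_cast<const float *>(src);
    for (int64_t i = 0; i < n; ++i) {
        dst[i] = s[i];
    }
}

// One static table; callers never own or copy an entry.
static const demo_type_traits k_demo_traits[DEMO_TYPE_COUNT] = {
    { "f32",    sizeof(float), demo_f32_to_float },
    { "opaque", 1,             nullptr           },
};

// Returning a pointer avoids copying the whole struct on every lookup.
static const demo_type_traits * demo_get_type_traits(demo_type type) {
    assert(type >= 0 && type < DEMO_TYPE_COUNT);
    return &k_demo_traits[type];
}

// A float-based fast path (such as a BLAS call) is only usable when the
// type can be converted to float first.
static bool demo_can_use_float_path(demo_type type) {
    return demo_get_type_traits(type)->to_float != nullptr;
}
```

With this layout, repeated lookups for the same type return the same address, and a caller can test `demo_can_use_float_path(type)` before choosing the float-based route.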

slaren and others added 11 commits October 3, 2024 01:49

ggml-backend : add device and backend reg interfaces (#9707)

c83ad6d

Co-authored-by: Johannes Gäßler <[email protected]>

Fixed dequant precision issues in Q4_1 and Q5_1 (#9711)

5639971

rpc : enable vulkan (#9714)

841713e

closes #8536

convert : handle tokenizer merges format from transformers 4.45 (#9696)

e3c355b

ggml-backend : add device description to CPU backend (#9720)

a7ad553

metal : fix compute pass descriptor autorelease crash (#9718)

5d5ab1e

ggml: refactor cross entropy loss CPU impl. (ggml/976)

eee39bd

ggml/ex: calculate accuracy in graph, adapt MNIST (ggml/980)

fabdc3b

sync : ggml

1bb8a64

metal : remove abort (skip) (ggml/0)

d5ed2b9

github-actions bot added the examples, devops, python, ggml, Vulkan, SYCL, Nvidia GPU, testing, script, Apple Metal, Kompute labels Oct 3, 2024

pull bot removed the examples, devops, python, ggml, Vulkan, SYCL, Nvidia GPU, testing labels Oct 3, 2024

github-actions bot added the Apple Metal, Kompute labels Oct 4, 2024

ci : fine-grant permission (#9710)

f3fdcfa

github-actions bot added the nix label Oct 4, 2024

slaren and others added 15 commits October 4, 2024 18:50

ggml : fixes after sync (ggml/983)

ff56576

* ggml : remove test-backend-buffer
* ggml : fix CUDA build warnings

ggml : fix typo in example usage ggml_gallocr_new (ggml/984)

55951c0

sync : ggml

1788077

Add Llama Assistant (#9744)

71967c2

metal : zero-init buffer contexts (whisper/0)

905f548

sync : ggml

58b1669

rerank : use [SEP] token instead of [BOS] (#9737)

8c475b9

* rerank : use [SEP] token instead of [BOS] ggml-ci
* common : sanity check for non-NULL tokens ggml-ci
* ci : adjust rank score interval ggml-ci
* ci : add shebang to run.sh ggml-ci

vulkan : retry allocation with fallback flags (whisper/2451)

b0915d5

Co-authored-by: Samuel Morris <[email protected]>

sync : llama.cpp

b6d6c52

readme : fix typo [no ci]

f4b2dcd

contrib : simplify + minor edits [no ci]

d5cb868

Update building for Android (#9672)

f1af42f

* docs : clarify building Android on Termux
* docs : update building Android on Termux
* docs : add cross-compiling for Android
* cmake : link dl explicitly for Android

github-actions bot added the documentation label (Improvements or additions to documentation) Oct 7, 2024

slaren and others added 7 commits October 7, 2024 21:55

ggml : add backend registry / device interfaces to BLAS backend (#9752)

6374743

* ggml : add backend registry / device interfaces to BLAS backend
* fix mmap usage when using host buffers

scripts : fix spelling typo in messages and comments (#9782)

fa42aa6

Signed-off-by: Masanari Iida <[email protected]>

server : better security control for public deployments (#9776)

458367a

* server : more explicit endpoint access settings
* protect /props endpoint
* fix tests
* update server docs
* fix typo
* fix tests

examples : remove llama.vim

3dc48fe

An updated version will be added in #9787

perplexity : fix integer overflow (#9783)

e702206

* perplexity : fix integer overflow ggml-ci
* perplexity : keep n_vocab as int and make appropriate casts ggml-ci
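The commit message does not show the changed code, so the following is only a sketch of the general technique it names: keep `n_vocab` as an `int` and cast at the point where an index into a flat logits buffer is computed. The helper name `logits_offset` and the buffer layout are assumptions made for illustration, not taken from llama.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: offset of the logits row for token `i_token` in a flat
// [n_tokens x n_vocab] buffer. Both inputs stay plain ints; each operand is
// widened to 64 bits *before* the multiplication, so the product cannot wrap
// the way `i_token * n_vocab` would in 32-bit int arithmetic.
static uint64_t logits_offset(int i_token, int n_vocab) {
    assert(i_token >= 0 && n_vocab > 0);
    return static_cast<uint64_t>(i_token) * static_cast<uint64_t>(n_vocab);
}
```

For example, with `i_token = 20000` and `n_vocab = 128256` the offset is 2,565,120,000, which exceeds the largest 32-bit signed value (2,147,483,647). Casting only the result of an `int` multiplication would be too late, since the overflow would already have happened.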

cmake : do not build common library by default when standalone (#9804)

c81f3bb

github-actions bot added the android, build labels Oct 9, 2024

teleprint-me closed this Oct 10, 2024